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Abstract 

Biochemical systems involving a high number of components with intricate interactions often lead to complex models 
containing a large number of parameters. Although a large model could describe in detail the mechanisms that underlie the 
system, its very large size may hinder us in understanding the key elements of the system. Also in terms of parameter 
identification, large models are often problematic. Therefore, a reduced model may be preferred to represent the system. 
Yet, in order to efficaciously replace the large model, the reduced model should have the same ability as the large model to 
produce reliable predictions for a broad set of testable experimental conditions. We present a novel method to extract an 
"optimal" reduced model from a large model to represent biochemical systems by combining a reduction method and a 
model discrimination method. The former assures that the reduced model contains only those components that are 
important to produce the dynamics observed in given experiments, whereas the latter ensures that the reduced model 
gives a good prediction for any feasible experimental conditions that are relevant to answer questions at hand. These two 
techniques are applied iteratively. The method reveals the biological core of a model mathematically, indicating the 
processes that are likely to be responsible for certain behavior. We demonstrate the algorithm on two realistic model 
examples. We show that in both cases the core is substantially smaller than the full model. 
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Introduction 

Biochemical networks are often very complex. The complexity 
may arise from the large number of components involved in the 
network and/ or from their intricate interactions. When such 
systems are modeled by differential equations, we obtain a large 
non-linear differential equation system with many parameters. 
There are some advantages for having a large model, e.g., it may 
capture in detail the mechanisms of the system and therefore 
might give accurate predictions. On the other hand, model 
complexity also gives rise to severe problems, e.g., hard 
understanding of system behavior under varying conditions; long 
computing times, especially in case of stiff models; and parameter 
identification problems, especially in the case of limited data 
availability. To overcome these issues, reduced models that still 
capture the essential features of the system are highly desirable. 

Several methods for model reduction are already available, e.g., 
time-scale separation [1-4], sensitivity analysis [5-7], and lumping 
[8,9]. These methods typically require prior knowledge of the 
parameter values of the model before they can be applied. 
Therefore, only the first two above-mentioned problems might be 
remedied in this way, whereas the problem of parameter 
identification, which is often the most problematic issue in systems 
biology, remains. In addition, some of these methods may lead to 
reduced models that are structurally different from the original 
one. This is because a component in the new reduced model may 



be a combination of several components in the original model, or a 
component in the original model could be contained in several 
components of the reduced model. This usually obstructs the 
biological interpretation of the reduced model. 

In previous work we developed a reduction method to simplify 
biochemical models in systems biology [10]. This method is based 
on the so-called "admissible region" concept, i.e., the set of 
parameters for which the mathematical model yields some 
required output. This concept reflects the parameter uncertainty 
that commonly occurs in systems biology models. In contrast to 
the methods mentioned above, our method does not require prior 
knowledge of the parameter values. It also does not require a 
transformation, so that the reduction result is directly interpret- 
able. However, our procedure to construct a reliable reduced 
model was not yet complete. The method only makes use of data 
which were obtained from experiments under specific conditions. 
The behavior of the system under conditions that are different 
from these experiments, might not be well predicted. 

In this paper we repair this shortcoming by presenting a novel 
approach to extract a reliable reduced model from a full model 
under a large variety of experimental conditions. The proposed 
approach combines a reduction method and a model discrimina- 
tion method. By combining these two methods, we arrive at a 
simpler model that still has powerful prediction capabilities. This 
in turn will help us in understanding the behavior of the complex 
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system, since such a reduced model apparently contains the core of 
the mechanisms underlying the system dynamics. 

Materials and Methods 

Consider a biochemical network for which the dynamics of its n 
components is modeled by a system of ordinary differential 
equations (ODEs) 

Tt =f(xk ' e) (i) 
y =g(x) 

with initial values 

x(; = 0) = x 0 . (2) 

Here, x e R" represents the concentration of the species in the 
network, k e R m is the parameter set in the model, yel' stands 
for the model output with 1 < q < n, and eeR p represents the 
experimental conditions under which the model output y is 
measured. Throughout this paper, the components of y are 
referred to as "the target components" of the system. The 
measured data for y are denoted by y. 

In practice, a common approach to estimate the parameter set 
k = (k\,k.2, . . . ,k m ) is by fitting model (l)-(2) to an initial dataset y. 
In this stage, the dataset that we have is usually very poor and thus 
the parameters that are found are not yet well identified, i.e., their 
values are yet rather sloppy. Therefore, a new experiment, based 
on optimal experimental design, is then carried out to obtain a 
new dataset and the parameter estimation is repeated. These steps 
are applied iteratively until all parameters can be hopefully 
identified, as depicted in Figure 1A. Unfortunately, in most cases, 
it is very difficult to identify all of them. This especially happens if 



the number of parameters is large. In those cases, it is convenient 
to work with a simpler model with less parameters so that 
parameter identification can be carried out efficaciously. 

Although a reduced model contains less components and/or 
parameters than the original model, it is important that it should 
still be able to reliably predict the behavior of the system for any 
feasible experimental conditions that are considered relevant to 
answer questions at hand. Only in this case, the reduced model 
can replace the full model and fully represent the system. Thus, for 
example, if the initial condition of a particular biochemical species 
Xj in the experiment can be in the range of a<x,(0)<A, then the 
behavior of the system should be well predicted by the reduced 
model for any initial condition x,(0) e [a,b] . Also, if a particular 
perturbation can be applied in an experiment, e.g., deletion and/ 
or knock-out of some genes, the behavior of the perturbed system 
should still be well predicted. In this paper, the set of all feasible 
experimental conditions is denoted by E. A reduced model that 
does not contain redundant component and/or parameter and 
can reliably predict the dynamics of the target components for any 
e e E is referred to as "an optimal model". 

To extract an optimal model, we combine our reduction 
method [10] with a model discrimination method. The procedure 
is sketched in Figure IB. The essence of this scheme is that for the 
obtained reduced model, it is investigated whether an experimen- 
tal condition can be found for which the reduced model yields an 
outcome that is significantly different from what the full model 
would predict. If this is the case, the reduced model is not accepted 
as being "optimal". 

Notice that parameter estimation is frequently used in our 
procedure. For this aspect a vast body of separate literature exists, 
e.g., [11-16]. In our calculation, we made use of the nonlinear 
least square solver from MATLAB which is a local optimization 
algorithm. 




(A) 



(B) 



Figure 1. Approaches to estimate parameter in Systems Biology. (A) Common approach, (B) Proposed approach to yield optimal model with 
fewer parameters. 

doi:1 0.1 371 /journal.pone.0083664.g001 
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Model reduction 

After having obtained a parameter estimate k for the full model, 
the first step to obtain an optimal model is to reduce the model 
complexity by removing redundant components and/or param- 
eters that do not contribute to the dynamics of the target 
components. For this purpose, we have developed a reduction 
method in [10] that utilizes the concept of admissible region. In 
this method, the parameters that are removed are those that are 
badly identified. The method does not necessarily require prior 
biological knowledge. However, the method can easily be tuned to 
incorporate prior knowledge, if this is available. The main features 
of our reduction method are summarized below. 

Admissible region. Suppose that M time-series data of the 
target components y'(?,,e') are obtained from experiments, which 
were conducted under M different experimental conditions e', 
with /= 1, . . . ,N, and 1=1,... ,M. We measure the distance of 
the model output (target components) to the time-series data using 
the following function 



S(y(t,k,e),f(t,e)) = 



Notice that S is the least squares measure that can be interpreted 
as the average squared deviation between the model prediction 
and the data. 

Let us introduce a tolerance s 2 which indicates how much 
difference we accept as discrepancy between the data and the 
model prediction. In many cases, the variance of the noise from 
the experimental data can be used as a guidance to choose a 
suitable value for e. Then, any parameter vector k such that 



S(y(r,k,e),y(r,e))< £ 2 



(4) 



is acceptable to represent the parameters of the system, since it is 
capable of producing the dynamics within the required accuracy. 
We say that all parameter vectors k that satisfy (4) constitute the 
so-called "admissible region" (AR). Thus, 



AR={keU m \ S(yO,k,e),y(*,e)) < e 2 } . 



(5) 



Notice that the region AR reflects the parameter sloppiness in the 
model, i.e., different parameter sets may yields the same model 
dynamics [17]. Additionally, the broad admissible region implies 
that the model encounters a practical identifiability problem [18]. 

Reduction method. Since all parameter vectors in the 
admissible region yield an acceptable dynamical behavior of the 
system, the shape of the AR may suggest whether reduction is 
possible. For example, if the admissible region includes a part of a 
parameter axis, then this parameter can apparendy be set to zero 
and could thus be excluded from the model. If the region extends 
to infinity in a certain parameter direction, then some terms or 
state-variables in the ODEs might be lumped. This analysis may 
thus lead to a simpler representation of the dynamics of the 
biochemical system. 

Describing the admissible region and deducing the possible 
reductions is relatively easy for a small system, as shown in [10]. 
However, applying such analysis to a model with many 
parameters, which is typically the case in systems biology, can 
be very complicated. Fortunately, we notice that in practice it is 
not necessary to construct the admissible region completely. If one 
(or several) parameter(s) can be set to zero (or infinity) and the 
others can be re-optimized such that the resulting parameters 



k r e R"' are still in the admissible region, then the model can be 
simplified. This reduction procedure can be carried out in a 
systematic way by applying first node reduction, then parameter 
reduction, and finally node lumping, as we will shortly discuss 
below. 

Node reduction. First, we try to remove redundant nodes, 
one at a time. Here, e.g., node X\ can be removed from the system 
if it can be eliminated in all equations and the parameters can be 
re-optimized such that (4) is satisfied. This procedure is repeated 
for If one or more nodes have been removed, we cycle 

again through the remaining nodes and repeat the procedure until 
no further nodes can be removed. 

Parameter reduction. To see whether a parameter, k\ say, 
can be removed, we simply set k\ = 0 and re-estimate the other 
parameters. If (4) is satisfied, then indeed k\ can be removed from 
the model. Next, this procedure is repeated for ki, ■ ■ ■ ,k m . If one 
or more parameters have been removed, we cycle again through 
the remaining parameters and repeat the procedure until no 
further parameters can be removed. 

Since the approach is heuristic, the result of the reduction might 
depend on the parameter ordering and might be not unique. In 
principle, all reduced models obtained this way are acceptable. 
However, for reasons of parsimony, the strongest reduction is 
preferable. For this purpose, we order the parameters based on the 
sensitivities 



kj_8S_ 
~S~dkj' 



(6) 



Although these parameters sensitivities are local quantities, we 
found that in general they give a very good reduction rate, see 
[10]. 

Lumping. If a parameter that represents the strength of a 
reaction can be set at a very large value and the others can be 
adjusted to satisfy (4), it may indicate that the corresponding 
reaction can be considered as instantaneous. This implies that the 
two corresponding nodes that are connected by the reaction can 
be lumped, and hence may be replaced by one node. The 
procedure for lumping essentially follows the same steps as 
mentioned under parameter reduction. 

Model discrimination 

Suppose that from the model reduction procedure above, we 
obtain a reduced model 



= f,.(x f ,k,,e) 



dt 

y r = g(x r ,k r ,e) 



(7) 



where x r ,f r e U R with R<n, and y,. e W denotes the dynamics of 
the target components of the reduced model. Thus, we now have 
two different models to describe the measured target components, 
i.e., the full model and the reduced model. The next step is to 
investigate whether the reduced model will generate the same 
prediction as that of the full model for any feasible experimental 
conditions that are considered relevant to answer questions at 
hand. Only if all these predictions agree sufficiently, then we 
conclude that the full model can be replaced by the reduced 
model. 

To check the predictive power of both models, an optimal 
experimental design approach to discrimate the models is utilized 
[19-22]. Here, we look for an experimental condition e on E and 
time sampling t that maximize the distance between the full and 
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reduced models in terms of the distance function S in (3). 
Mathematically, this can be written as 

argmax [S(y(f,k,e),y,(f,k,.,e))]. (8) 

eeE 

We say that a reduced model cannot be distinguished from the 
full model if their distance satisfies 

S(y(;,k,e),y r (f,k r ,e))<<5 2 , f 0 r any eeE. (9) 

with S a value that denotes the tolerance criterion. The value for 5 
must be chosen by the modeller. Since S represents the worst 
deviation between the predictions of the reduced and the full 
models, the smaller the value of S is, the more powerful the 
reduced model will be. However, 8 should be chosen larger than g, 
because otherwise we might end up with modeling noise. 

Model reduction and model discrimination applied 
iteratively 

Here, we discuss the essential features of our algorithm to obtain 
an optimal model. For illustrational purposes we sketch in Figure 2 
the parameter space of a system with only two parameters. The 
admissible region shown in Figure 2 contains the parameter 
vectors for which the full model produces the required result, 
within the specified tolerance. Note that the candidates for the 
parameter set of the reduced model are those that lie on the 
parameter axes within the admissible region. 

To find the optimal model, we could in principle compare each 
parameter set candidate of the reduced model in AR to the full 
model under all feasible experimental conditions. In practice, this 
is impossible. Therefore we apply an iterative algorithm. We first 
apply the reduction method in [10] and obtain a reduced model 
that at least for the measured target components shows the same 
behavior as the full model. Next, we compare this particular 
reduced model with the full model under all feasible conditions 
and select the condition for which the difference is biggest by 
applying hybrid optimization (a combination of global and local 
optimization). This optimization is carried out many times to make 
sure that the condition that we find is close to the global 
maximum. This is called "discrimination". 



'2' 








ARi ^\ 






v \ AR 2 \ 




0 





Figure 2. Illustration of an admissible region for a system with 
two parameters. Initially, the admissible region of the system is AR\. 
In this situation, a reduced model can be obtained either by setting 
/ci=0 or ^2 = 0. When a new dataset from a new experiment is 
incorporated, the admissible region shrinks to ARi- Thus, AR2<=AR\. 
Now, a reduced model can only be obtained when /c2 = 0. 
doi:1 0.1 371 /journal.pone.0083664.g002 



Normally, this difference is still huge in this first step. Then, we 
add the data according to this new experimental condition to our 
dataset. In the second step this extended dataset is used as starting 
point. This second step starts with calculation of updated 
parameters k for the full model. Then, the reduction method 
from [10] is applied leading to a new reduced model. If this second 
reduced model is compared to the full model under all feasible 
conditions, one usually finds that the difference in model 
predictions becomes smaller than in the first step of the algorithm. 
The procedure is repeated until this difference between reduced 
model and full model is smaller than the threshold for all feasible 
experimental conditions. When the optimization procedure only 
yields conditions that make the deviation always less than our 
tolerance in (9), we accept that the reduced model approximates 
the full model everywhere on E. This resulting model for which 
this holds is called "optimal". 

Algorithm 

In summary, the method that we propose consists of the 
following steps: 

1 . Obtain data from experiment. 

2. Estimate the parameters of the full model. 

3. Apply reduction to the full model. 

4. Try to discriminate the resulting reduced model obtained in 
step 3) from the full model obtained in step 2). 

5. If there indeed exists an experimental condition that can 
discriminate them, add the data according to this condition to 
the dataset and repeat step 2)-4). Otherwise, an optimal model 
has been obtained. 

Results 

In our view, model reduction and model discrimination should 
be an integral part of the modelling-experimental cycle. When 
model discrimination identifies an experimental condition to 
separate the reduced model from the full model, then the 
corresponding experiment should be carried out in the lab. 
However, in order to show how the proposed approach may work 
out in practice, here we use a different approach. The method is 
applied to two established models from literature: a flowering 
model that describes the genetic interactions underlying flower 
development, and an EGFR network model of a signaling 
transduction network. Both models have been published with a 
full parameter set, and in this sections we adopt the outcome of 
these models (with some additional noise) as experimental result. 
Note that we do not use these published parameters in our own 
fitting/ reduction/discrimination algorithm. For simplicity, here 
we used the same sampling times t for all experiments. In practice, 
one might have different sampling times for each experiment. In 
this way, the choice of sampling times could be part of the 
experimental setup. 

Flowering network model 

The dynamic model from [23] describes the genetic interactions 
of five types of MADS genes underlying flower development. The 
expression patterns of these genes are associated with floral organ 
identity via the so-called ABCDE model [24]. Of the four floral 
organ types in Arabidopsis, sepals are linked to high expression of 
the A gene, petals with A and B, stamens with B and C, and 
carpels (including ovules) with C and D. All organs require high 
expression of the E gene. The genes that represent the five 
ABCDE functions in this model, are API (A), AP3 (B), PI (B), AG 
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Figure 3. Graphical representation of the genes interactions of flowering in Arabidopsis. (A) The full model, (B) The reduced model. 
doi:1 0.1 371 /journal.pone.0083664.g003 



(C), SHP1 (D) and SEP (E). Gene expression is modeled as protein 
concentration, and the genes interact in this model via pairs of two 
proteins (dimers) that regulate each other's genetic transcription 
rate, see Figure 3. The network dynamics differentiates between 
the floral whorls via location-specific trigger mechanisms. 

The model consists of 6 state variables (representing proteins 
and dimers), and 37 parameters representing the rates of the 
biochemical interactions (see supplements for details). The model 
is able to generate realistic predictions for the following 
experimental conditions 

E= {e 1 = wildtype, e 2 = knock — out AP3, e 3 = knock — out PI, 
e 4 = knock — out AG, e 5 = ectopic expression of AP3, (10) 
<? 6 = ectopic expression of AG} . 

For knock-out experiments, the initial and production terms of the 
corresponding genes are set to zero. For ectopic expression, we 
assume that the initial concentrations of the corresponding genes 
are available abundantly; therefore they are set to a high level. In 
this case, AP3(t = 0)=\0 3 whereas AG(t = 0) = 5 x 10 3 . Thus, 
there are only six feasible experiments: wild-type experiments; 
setting initial and production term of AP, PI, and AG to zero; and 
setting initial concentration of AP3 and AG to a specific high 
value. These feasible experiments are taken from [23]. 

To show how our proposed method can be applied to a real 
biological system, we assume that the parameters in the model are 
unknown and have to be estimated from the experimental data. 
The six measured gene expressions are regarded as our target 
components in this example. We further assume that the 
measurements can be conducted at the conditions in (10). Then 
we investigate whether the structure of the proposed full model 
contains redundancy and hence, whether the dynamics can be 
described by a simpler model. 

As experimental outcome, we generate the data of the target 
components using the full model and the parameters in [23], 
adding relative normal random noise of 20%. The measurements 
are assumed to be performed at t = 0,1,2,3,4,5 days. 

For the initial dataset, the wild-type measurement is carried out. 
The results are shown in Figure 4, denoted by '*'. If we set e = 20% 
we found that the dataset can be well represented by the model 
with many possible parameter sets, one of them is k = k|. When 
reduction is applied, it turns out that 7 out of 37 parameters can be 
removed from the model and yet the reduced model can still fit the 
dataset as shown by the dashed lines in Figure 4. This parameter 
set is denoted by k = k' . 

Thus, we have two possible models to represent the behavior of 
the wildtype, i.e., the full model and the reduced model. Following 



the algorithm described in the Methods section, model discrim- 
ination is applied to find which experiments from (10) can 
discriminate the two models. Setting the tolerance criterion in the 
model discrimination equal to 3 = 40%, we find that experimental 
condition 

e 4 = knock - out AG (11) 

distinguishes the reduced model from the full model, as shown in 
Figure 5. In this case, the distance between the reduced and the 
fuU models is SXy(f,k},e 4 ),y r y(a^e 4 )«9.6 x 10 6 . Therefore, this 
experiment is then carried out to confirm this difference, and we 
obtain the dataset denoted by '*' in Figure 5. 

From the new experiment, we found that the predictions from 
both the reduced and the full model cannot represent the behavior 
of the mutant. Therefore, the parameters in the full model should 
be re-estimated to fit both the wild-type and the mutant AG 
dataset and the reduction should also be repeated. Then, we 
obtain a new parameter set k = k 2 for the full model and a reduced 

parameter set k = k 2 . In this new reduced model, 6 parameters can 
be reduced. 

Given the list of possible experiments in (10) and the threshold 
value S = 40% for model discrimination, we found that the new 
reduced model now cannot be discriminated anymore from the 
full model. Thus, we may claim that the reduced model can 
replace the full model to represent all system behavior that we are 
interested in. This is underpinned by the fact that all datasets that 
are obtained for all conditions in (10) can be predicted very well by 
the reduced model, as shown in Figure S 1 . The parameter values 
for the full and optimal models are shown in Table 1. 

The reduction allows for an interesting conclusion: all 
interactions that originate from dimer [API, SEP] are not needed 
to explain the behavior of the system under the conditions in (10). 
Thanks to reduction, we found that these interactions can be 
replaced by a constant production term. We conclude that the 
reduced model in Figure 3B is the core network that is responsible 
for the dynamics under conditions (10). 

EGFR model 

Next, we apply our method to the epidermal growth factor 
receptor (EGFR) model from [25], of which the network is shown 
in Figure 6A. This model describes the cellular response to an 
epidermal growth factor (EGF) stimulation. The model consists of 
23 biochemical components with 25 chemical reactions, described 
by ordinary differential equations (ODEs). This results in an ODE 
system with 23 state variables and 50 parameters. Since the kinetic 
scheme contains several cycles, the kinetic parameters involved in 
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Petals 




Days 

Stamens 



Days 

Carpels 



* 


AP1 


* 


AP3/PI 


* 


AG 


* 


SHP 


* 


SEP 




Figure 4. The concentration dynamics of the proteins. These proteins are part of dimer complexes in each of the four organ initiation sites for 
the wildtype dataset. The solid and dashed lines show that the dataset can be fitted by the full model with k = k' as well as by the reduced model 
with k = kj. 

doi:1 0.1 371 /journal.pone.0083664.g004 



the cycles satisfy the so-called "detailed balance" relationships 
given by 



phosphorylated PLC}', Grb2 bound in She, and Grb2 bound in 
EGFR, which are composed of several species in the model: 



k 9 -k l0 -k n -k l2 
fc_9-fc_io-fc-irfc-i 



k\5-k2vk-irk-y, 
k-i5-k-2\-kivky, 



k\g-k22'k-\9-k_2t 
k-ix-k-22'ki9'k2t 



^12'^22'^21'^23 



-12'K-22''C-21'K-23 



&15'£-20'&-23'^-24 
^-15'^20'^23'^24 



1. 



(12) 



(13) 



(14) 



(15) 



(16) 



The parameter values of the full model are given in Table 2 in 
[25]. 

To validate their model, the system was stimulated with different 
EGF stimulations (20 nM, 2 nM, and 0.2 nM) and the resulted 
transient response of several proteins were measured. These are the 
concentrations of phosphorylated EGFR, phosphorylated She, 



Total phosphorylated EGFR = 2([RP] + [R - PL] + 
[R-PLP] + [R-G] + [R-G-S] + [R-Sh] 

+ [R-ShP] + [R-Sh-G] + [R-Sh-G-S]) 



(17) 



Total phosphorylated PLCy = [R - PLP] + [R - PLCyP] (18) 



Total phosphorylated She = [R - ShP] + [R - Sh - G] + 
[R-Sh-G-S] + [ShP] + [Sh-G] + [Sh-G-S] 

Total Grb2 bound to EGFR = [R - G] + 
[R-G-S] + [R-Sh-G] + [R-Sh-G-S] 

Total Grb2 bound to Shc= [R-Sh-G] + 
[Sh-G] + [R-Sh-G-S] + [Sh-G-S]. 



(19) 



(20) 



(21) 



The model was then used to predict the dependency of the 
transient responses on the relative abundance of some signaling 
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Figure 5. Model discrimination applied to the first full model and reduced model. Model discrimination finds that knocking out AG will 
distinguish the reduced model with k = k' from the full model with k = k|. When an experiment is conducted for this mutant, we obtain the dataset 
that is denoted by 

doi:1 0.1 371 /journal.pone.0083664.g005 



Table 1. Parameter values of the full and optimal models in 
the flowering model. 



Parameter 


k 2 




Parameter 




k; 




6.6e4 


6.6e4 


hi 


1.4e4 


1.7e4 




5 


0 


Km4\ 


7 


13 


Km\ 2 


3.8e2 


2.7e2 


hi 


1.3e4 


1.4p4 


dcj 


71 


70 




81 


0 


hi 


3.3e4 


3.3e4 




6.9e2 


4.3e2 


Km 2y \ 


6.2e2 


6.2e2 


dC4 


5.1e2 


6.1e2 


hi 


76 


88 


P4 


4.4£-3 


5.5e3 




4e2 


0 


hi 


4.2?2 


4.2e2 


hi 


49 
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proteins, that is when the initial concentration of She was 
decreased by a factor of 4, the initial concentration of Grb2 was 
increased by a factor of 4, and when the initial concentration of 
EGFR was increased by a factor of 4. 

To show how our method can be applied to this biological 
system, we assume that the parameters in the model are unknown 
and should be estimated from experimental data. The measured 
responses are regarded as our target components in this example. 
Since the target components were measured and predicted for 
different EGF stimulations and different initial conditions of 
EGFR, She, and Grb2, we assume that the relevant feasible 
experiments are to vary the stimulations and initial conditions of 
these species. Thus, 

E ={(EGF stimulation , EGFRo, Grb2 0 , Shc 0 ) | 

0 < EGF stimllla t ion < 20nM, 0 < EGFR 0 < 400nM, (22) 
0<Grb2 0 <340nM, 0<Shc 0 < 150nM}, 

where EGFRo, Grb2o, and Shco are the initial concentrations 
(concentrations at t = 0) of [EGFR], [Grb2], and [She] respec- 
tively. Notice that the space of feasible experimental conditions is 
very large. In model discrimination procedure, for this example, 
we utilize a hybrid optimization tool (combination of genetic 
algorithm and local optimization) in Matlab to obtain experimen- 
tal conditions that maximizes the difference between the reduced 
and the full models. 

As experimental outcome, we generate the data of the target 
components using the full model and the parameters in ([25]), 
adding a relative normal random noise of 5%. The measurements 
are assumed to be performed at t = 0, 15, 30, 45, 60, 120 seconds. 



PLOS ONE | www.plosone.org 



7 



January 2014 | Volume 9 | Issue 1 | e83664 



Identifying Optimal Models 



G-S 




(B) 



R-PL ATP 




Figure 6. The EGFR biochemical network. A solid arrow represents a reaction with two kinetic parameters and a dashed arrow represents a 
reaction with one kinetic parameter. (A) The full network from [25], (B) The optimal network to produce the dynamics of the five target components 
for any experimental condition eeE in (22). 
doi:1 0.1 371 /journal.pone.0083664.g006 



For the initial dataset, we assume that it is obtained from 
experiments which are carried out at two different EGF 
stimulations, [EGF] = 20 nM and [EGF] = 2 nM. The other three 
initial conditions are set to [EGFR],, = 100 nM, [Grb2] 0 = 85 nM, 
and [Shc]o= 150 nM. Thus, the initial dataset is obtained from 
the experiments with conditions 



(EGF stlmulation = 20nM, EGFR 0 = lOOnM, 
Shc 0 = 150nM,Grb2 0 = 85nM), 
(EGF stimulation = 2nM, EGFR 0 = lOOnM, 
Shc 0 = 150nM, Grb2 0 = 85nM). 



(23) 



The dynamics of the target components are shown in Figure 7, 
denoted by '+' and 'x'. If s. in (4) is set to £ = 5% and parameter 
estimation is applied, we find that the dataset can be well 
represented by the model with many different parameter sets; one 
of them is k = ki. When reduction is applied, it turns out that 33 
out of 50 parameters can be removed from the model. This 
parameter set is denoted by k = k' . The reduced model can fit the 
dataset quite well, as shown in Figure 7. 

Applying model discrimination, we find that with the experi- 
mental conditions 



e 2 = (EGF stlmulatlon = 15.3824nM, EGFR 0 = 141nM, 
Shc 0 =0nM, Grb2 0 = 340nM), 



(24) 



the reduced model can be clearly distinguished from the full 
model, as can be seen in Figure S3. Their distance in this case is 
5(y(a/,e 2 ),y r (r,k',e 2 ))«3.1xl0 6 . 



To obtain an optimal model, we follow the procedure outlined in 
Section of Algorithm. First, experiment e 2 is performed to obtain a 
new dataset. So, the new dataset now consists of the combined 
dataset obtained with the experimental conditions e la , e"'and e 2 . 
Parameter estimation and model reduction are now carried out to 
the combined dataset. This procedure yields k = k 2 for the 
parameter set of the full model and k = k 2 for the parameter set 
of the reduced model. The number of parameters that can be 
reduced turns out to be 3 1 . Re-performing the model discrimina- 
tion, we found that the experimental condition that maximizes the 
distance between the full model and the reduced model is now e = e 3 
with distance S(y(?,k 2 ,e 3 ),y r (?,k^,e 3 )) = 23.5362. Setting the 
threshold value for model discrimination at S = 25%, we find that 
after including four additional experimental conditions the reduced 
model with k = kj? cannot be distinguished anymore from the full 
model with k = k^. The network of the optimal model is shown in 
Figure 6B and the iterative process to obtain the optimal model is 
shown in Figure S4. 

In the optimal model in Figure 6B, 24 parameters can be set to 
zero while one parameter, namely can be set to a very large 

value. The latter implies that the phosphorylation of [R-Sh] occurs 
very fast, and therefore, the components R-Sh and R-ShP can be 
lumped into one biochemical component in the optimal model. 
Thus, we end up with a model that consists of 17 biochemical 
components with 25 kinetic parameters. This result shows that we 
may remove six redundant components and 25 redundant 
parameters from the original model. The prediction for the five 
target components from the reduced model would then deviate at 
most 25% from that of the full model for any experimental 
condition in (22). 
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Time 



Figure 7. Dynamics of the target components for the start up dataset. The solid and dashed lines show that the dataset can be fitted by the 
full model with k = k| as well as by the reduced model with k = k'. 
doi:1 0.1 371 /journal.pone.0083664.g007 



As a validation, a number of experiments are performed with 
different random experimental conditions and the dynamics of the 
target components are predicted by the reduced model, as shown 
in Figure 8. The results show that the predictions of the reduced 
model are in very good agreement with the dynamics obtained 
from the experiments. Only in the first experiment, the prediction 
for Grb2 bound to She slighdy deviates from the measurement. 
However, the deviation is still acceptable. We, therefore, conclude 
that the reduced model in Figure 6 B with parameter set k = kj? is 
an optimal model to produce the dynamics of the five target 
components, given the threshold value of 8 = 25%. The param- 
eters of the full and optimal models and the list of experimental 
conditions used to obtain the optimal model are shown in Table 
SI and Table S2. 

Ras pathway. The reduction results for the EGFR network 
allow for some nice conclusion. For example, a number of 
chemical reactions that lead to the activation of Ras protein via 
SOS are not longer preserved in the optimal model. Mathemat- 
ically speaking, this implies that the parameters that are related to 
these reactions cannot be identified by only measuring the five 
target components above. From a biological point of view, deleting 
this branch is rather pointless since the function of EGFR is to 
activate the Ras^Raf-^Mek^ERK cascade [26]. To preserve 
this chemical pathway in the optimal model, one should think 
carefully which complex protein(s) should be treated as additional 
target component(s), or alternatively, which constraint in the 
reduction should be taken into account. In other words, prior 
knowledge should help us to prevent undesired results. Fortunate- 



ly, our method can easily be tuned to incorporate prior knowledge. 
In this specific case this could be done as follows. 

We observe from the network in Figure 6A that one possibility 
to maintain the pathway to the Ras protein is by preventing one of 
the incoming reactions to R-Sh-G-S or R-G-S from elimination. 
In practice, this can be established by preserving one of the 
following parameters from being zero, namely kio,k-n,ki9,k-20 
or &24- If we use this condition as our constraint in the reduction, 
we obtain the optimal model that is depicted in Figure S2G. Here, 
the activation of Ras protein can be achieved via R-Sh-G-S. 

Also in this optimal model, parameter k_ 14 can be set to a very 
large value and thus R-ShP and R-Sh can be lumped into one 
biochemical component. The model contains 28 kinetic param- 
eters, so about 44% of the kinetic parameters in the original model 
are redundant to represent the dynamics of the five target 
components. The iterative process to obtain this model is shown in 
Figure S5. As can be seen, the optimal model is now obtained after 
six new experiments have been performed. The parameters of the 
full and optimal models and the list of experiments to obtain the 
optimal model are shown in Table SI and Table S3. Notice that in 
this reduced model, the parameters in the branch that was 
preserved by using prior knowledge are not identified very well 
from these experimental data. 

Discussion 

In Systems Biology, we often face the problem of non- 
uniqueness: several models can describe measured data equally 
well. In such a situation one may sometimes choose between a 
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Figure 8. Validation of the optimal model with k = k|J. The data, marked with "*", "+", and "x", are obtained from random experimental 
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model that includes a lot of details of the underlying mechanisms 
but is complex, very time consuming, and might be over- 
parameterized, or a reduced version that is much more convenient 
to handle, but might have less predictive power. When modelling a 
system that is a part of a large intricate environment, one should 
be careful in identifying components/interactions can give 
meaningful contributions to answer ones questions. It is tempting 
to add components, and thus parameters, to the model that are 
actually not required in governing the dynamics of the observed 
components. This may result in difficulty in identifying the 
parameter values, while at the same time the understanding of the 
key functionality may be obscured by superfluous details. 
Therefore, for the sake of understanding the system, speeding up 
the computation, and parameter identification, a simpler model is 
usually more favorable. However, since a simpler model contain 
less detailed mechanisms, its predictive power might be less 
reliable. Therefore, model reduction requires a careful approach. 

At least two conditions must be satisfied by a reduced model to 
replace the full model. First, it should be able to fit the observed 
data; and second, it should have the same power as the full model 
to reliably predict the behavior of the system under different 
feasible experimental conditions that are considered important to 
answer questions at hand. The reduction technique in our 
approach assures the first requirement and the discrimination 
method ensures the last requirement. The model discrimination in 
our method can also be viewed as a way to verify whether the 
omitted components, reactions, and/ or parameters in the reduced 
model give a meaningful contribution to the model prediction. If 
they do, the dataset from a new experimental condition will 
confirm this so that in the next reduction, the method cannot 



remove the corresponding components and/or parameters. The 
resulting model is a trade-off between reliability and simplicity: it 
does not contain redundant components, but has enough 
predictive power to reliably predict the behavior of the system. 
Thus, with the proposed method, the redundant components can 
be easily detected and removed so that at the end, our model only 
contains components and parameters that are essential in 
generating the required predictions of the system. Obviously, the 
optimal model contains the core mechanisms that underlie the 
behavior of the biological system. 

Note that when we apply parameter estimation of the full model 
and model reduction for the first time, the initial data set may 
come from several experiments. Our example on EGFR model 
shows this (equation (23)) where the initial data were obtained 
from two experiments with different EGF stimulations. Obviously, 
one should make sure that the initial data set contains sufficiently 
rich information for parameter inference, but this is beyond the 
scope of our paper. After discrimination procedure, however, we 
do recommend to carry out only one single new experiment in 
every iteration, as it will be enough to falsify the result of the 
previous full model or reduced model. Furthermore, doing one 
experiment at a time will avoid doing experiments that are likely to 
be superfluous. 

Identifying parameters with high accuracy is difficult. There- 
fore, in [17] it is suggested to focus on model prediction rather 
than on parameter identification. In line with this, our approach 
minimizes the discrepancy between the model prediction from the 
reduced model and that of the full model. The remaining 
parameters in the reduced model might still have large uncertain- 
ties, but the correspondence between the model prediction from 
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the reduced model and that of the full model is very good. If 
required, additional parameter identification could be carried out 
on the remaining parameters in the reduced model. As the model 
contains less parameters, parameter identification can be carried 
out more efficaciously. Note that the parameters that are removed 
by our method are those that are badly identified, since their 
absence does not have significant effect on the prediction of the 
system. 

Eventually, we would like to stress that it is always crucial to 
keep in mind which functionality one wants to preserve in the 
reduced model. Otherwise, one may arrive at a reduced model 
that does not serve one's purposes. Therefore, although our 
reduction method does not require prior biological knowledge, if 
there is any, that knowledge should always be taken into account. 

Supporting Information 

Text SI Description of the flower network model in 
Arabidopsis. 

(PDF) 

Figure SI The dynamics of the proteins in each four 
organs. Measured dynamics are denoted by '*' whereas the 
dynamics from the reduced model with k = k 2 are denoted by the 
dashed lines. Parameter fitting is applied to dataset of wildtype (a) 
and knock-out AG (b). The resulted reduced model have a very 
good prediction for mutants knock-out AP3 (c), knock-out PI (d), 
ectopically expression of AP3 (e), and ectopically expression of AG 
(f)- 

(EPS) 

Figure S2 The EGFR biochemical network. A solid arrow 
represents a reaction with two kinetic parameters and a dashed 
arrow represents a reaction with one kinetic parameter. (A) The 
full network from [25], (B) The optimal network to produce the 
dynamics of the five target components for any experimental 
condition eeE in (27), (C) The optimal network as in (B) but with 
an additional constraint to maintain the activation pathway to Ras 
protein. 
(EPS) 

Figure S3 Model discrimination to distinguish the 
reduced model with k = kj from the full model with 

k = k|. In this case, e 2 = {EGF stimu | a tion = 15.3824nM,EGFR 0 
= 141nM,Shco = 0nM,Grb2 o = 340nM}. The new dataset ob- 
tained from an experiment based on the setting e = e 2 is indicated 
by '*'. The dashed curve in the upper left corner shows that the 
reduced model cannot fit this dataset. 
(EPS) 

Figure S4 Result of iterative process to obtain the 
optimal model for EGFR model. The threshold value of 
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